Randomness versus specifics for word-frequency distributions
Similar resources
A Stochastic Process for Word Frequency Distributions
A stochastic model based on insights of Mandelbrot (1953) and Simon (1955) is discussed against the background of new criteria of adequacy that have become available recently as a result of studies of the similarity relations between words as found in large computerized text corpora. FREQUENCY DISTRIBUTIONS Various models for word frequency distributions have been developed since Zipf (1935) ap...
Density Distributions in Random Binary Sequences Introduction: Frequency and Randomness
We start from the notion of the density of a word in a binary sequence. Given a fixed binary word w of length n, let g(m; k) denote the number of distinct binary words v of length m containing just k occurrences of w. An extended Central Limit Theorem can be applied to conclude that as $m \to \infty$, $g(m; k)\sqrt{2^{-n} m}/2^{m}$ becomes approximately normal with mean $2^{-n} m$ and variance $s 2^{-n} m$. We give a recur...
zipfR: Word Frequency Distributions in R
We introduce the zipfR package, a powerful and user-friendly open-source tool for LNRE modeling of word frequency distributions in the R statistical environment. We give some background on LNRE models, discuss related software and the motivation for the toolkit, describe the implementation, and conclude with a complete sample session showing a typical LNRE analysis.
Enumerable Distributions, Randomness, Dependence
The Kolmogorov-Martin-Löf randomness concept is extended from computable to enumerable distributions. This allows definitions of various other properties, such as mutual information in infinite sequences. Enumerable distributions (as well as distributions faced in some finite multi-party settings) are semimeasures; handling those requires some amount of care.
Extracting Randomness from Samplable Distributions
The standard notion of a randomness extractor is a procedure which converts any weak source of randomness into an almost uniform distribution. The conversion necessarily uses a small amount of pure randomness, which can be eliminated by complete enumeration in some, but not all, applications. Here, we consider the problem of deterministically converting a weak source of randomness into an almos...
Journal
Journal title: Physica A: Statistical Mechanics and its Applications
Year: 2016
ISSN: 0378-4371
DOI: 10.1016/j.physa.2015.10.082